Unleashing the Power of Distributed CPU/GPU Architectures: Massive Astronomical Data Analysis and Visualization Case Study
Upcoming astronomy research facilities will systematically generate
terabyte-sized data sets, moving astronomy into the petascale data era.
While such facilities will provide astronomers with unprecedented levels of
accuracy and coverage, the increases in dataset size and dimensionality will
pose serious computational challenges for many current astronomy data analysis
and visualization tools. With such data sizes, even simple data analysis tasks
(e.g. calculating a histogram or computing data minimum/maximum) may not be
achievable without access to a supercomputing facility.
To effectively handle such dataset sizes, which exceed today's single machine
memory and processing limits, we present a framework that exploits the
distributed power of GPUs and many-core CPUs, with a goal of providing data
analysis and visualization tasks as a service for astronomers. By mixing shared
and distributed memory architectures, our framework effectively utilizes the
underlying hardware infrastructure, handling both batched and real-time data
analysis and visualization tasks. Offering such functionality as a service in a
"software as a service" manner will reduce the total cost of ownership, provide
an easy to use tool to the wider astronomical community, and enable a more
optimized utilization of the underlying hardware infrastructure.
Comment: 4 pages, 1 figure, to appear in the proceedings of ADASS XXI, ed. P. Ballester and D. Egret, ASP Conf. Series
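The "simple data analysis tasks" cited above, such as computing a data minimum/maximum, are classic parallel reductions, and they illustrate how such a framework maps work onto a GPU. The following is a minimal CUDA sketch of a block-wise min/max reduction; the kernel name, data layout, and launch convention are illustrative assumptions on our part, not code from the paper.

```cuda
#include <cfloat>
#include <cuda_runtime.h>

// Block-wise min/max reduction over a float array. Each block writes one
// partial minimum and maximum; a second pass over the per-block partials
// (or a host-side loop) produces the final values. Assumes blockDim.x is
// a power of two. Launch with 2 * blockDim.x * sizeof(float) shared memory.
__global__ void minmax_kernel(const float *in, int n,
                              float *block_min, float *block_max)
{
    extern __shared__ float s[];
    float *smin = s;                  // per-thread minima
    float *smax = s + blockDim.x;     // per-thread maxima

    int tid = threadIdx.x;
    int stride = gridDim.x * blockDim.x;

    // Grid-stride loop: each thread folds many elements into a local pair.
    float lo = FLT_MAX, hi = -FLT_MAX;
    for (int i = blockIdx.x * blockDim.x + tid; i < n; i += stride) {
        lo = fminf(lo, in[i]);
        hi = fmaxf(hi, in[i]);
    }
    smin[tid] = lo;
    smax[tid] = hi;
    __syncthreads();

    // Shared-memory tree reduction within the block.
    for (int half = blockDim.x / 2; half > 0; half >>= 1) {
        if (tid < half) {
            smin[tid] = fminf(smin[tid], smin[tid + half]);
            smax[tid] = fmaxf(smax[tid], smax[tid + half]);
        }
        __syncthreads();
    }
    if (tid == 0) {
        block_min[blockIdx.x] = smin[0];
        block_max[blockIdx.x] = smax[0];
    }
}
```

In a distributed deployment of the kind described above, each node would reduce its own shard of the data this way and the per-node partials would then be combined across machines, matching the mixed shared/distributed memory design.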
Three-dimensional shapelets and an automated classification scheme for dark matter haloes
We extend the two-dimensional Cartesian shapelet formalism to d-dimensions.
Concentrating on the three-dimensional case, we derive shapelet-based equations
for the mass, centroid, root-mean-square radius, and components of the
quadrupole moment and moment of inertia tensors. Using cosmological N-body
simulations as an application domain, we show that three-dimensional shapelets
can be used to replicate the complex sub-structure of dark matter haloes and
demonstrate the basis of an automated classification scheme for halo shapes. We
investigate the shapelet decomposition process from an algorithmic viewpoint,
and consider opportunities for accelerating the computation of shapelet-based
representations using graphics processing units (GPUs).
Comment: 19 pages, 11 figures, accepted for publication in MNRAS
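For context, the Cartesian shapelet formalism being extended here is standard (Refregier 2003): the one-dimensional basis functions are Gauss-Hermite functions, and higher-dimensional bases are products of them. In the usual notation (ours, not necessarily the paper's), with H_n the nth Hermite polynomial and beta a characteristic scale,

```latex
\phi_n(x) = \left[2^n n!\,\sqrt{\pi}\right]^{-1/2} H_n(x)\,e^{-x^2/2},
\qquad
B_n(x;\beta) = \beta^{-1/2}\,\phi_n(x/\beta),
\qquad
B_{\mathbf{n}}(\mathbf{x};\beta) = \prod_{i=1}^{3} B_{n_i}(x_i;\beta),
```

so a halo density field is represented as a sum of these basis functions weighted by shapelet coefficients, and quantities such as the mass and centroid follow from the low-order coefficients.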
Accelerating incoherent dedispersion
Incoherent dedispersion is a computationally intensive problem that appears
frequently in pulsar and transient astronomy. For current and future transient
pipelines, dedispersion can dominate the total execution time, meaning its
computational speed acts as a constraint on the quality and quantity of science
results. It is thus critical that the algorithm be able to take advantage of
trends in commodity computing hardware. With this goal in mind, we present
analysis of the 'direct', 'tree' and 'sub-band' dedispersion algorithms with
respect to their potential for efficient execution on modern graphics
processing units (GPUs). We find all three to be excellent candidates, and
proceed to describe implementations in C for CUDA using insight gained from the
analysis. Using recent CPU and GPU hardware, the transition to the GPU provides
a speed-up of 9x for the direct algorithm when compared to an optimised
quad-core CPU code. For realistic recent survey parameters, these speeds are
high enough that further optimisation is unnecessary to achieve real-time
processing. Where further speed-ups are desirable, we find that the tree and
sub-band algorithms can provide 3-7x better performance, at the cost of
trade-offs in smearing, memory consumption, and development time. We finish
with a discussion of the implications of these results for future transient
surveys. Our GPU dedispersion code is publicly available as a C library at:
http://dedisp.googlecode.com/
Comment: 15 pages, 4 figures, 2 tables, accepted for publication in MNRAS
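The "direct" algorithm referred to above is a brute-force sum over frequency channels at each trial dispersion measure (DM), and its regular memory access pattern is what makes it such a good GPU candidate. A minimal CUDA sketch, assuming channel-major input and per-channel delays precomputed in units of samples (names and layout are our illustrative assumptions; the published dedisp library is considerably more optimised):

```cuda
// Direct incoherent dedispersion: out[dm][t] = sum over channels c of
// in[c][t + delay[dm][c]]. The delay table holds the standard dispersion
// shift rounded to samples, delay = round(4.149e-3 * DM * (f^-2 - f0^-2)
// / tsamp), with f in GHz, DM in pc cm^-3, and tsamp in seconds. nout must
// be at most nsamp minus the largest delay so all reads stay in bounds.
__global__ void dedisperse_direct(const float *in, int nchan, int nsamp,
                                  const int *delay, int ndm, int nout,
                                  float *out)
{
    int t  = blockIdx.x * blockDim.x + threadIdx.x;  // output time sample
    int dm = blockIdx.y;                             // trial DM index
    if (t >= nout || dm >= ndm) return;

    float acc = 0.0f;
    for (int c = 0; c < nchan; ++c)
        acc += in[c * nsamp + t + delay[dm * nchan + c]];
    out[dm * nout + t] = acc;
}
```

Each thread produces one (DM, time) output sample, so adjacent threads read adjacent input samples and writes are fully coalesced; the tree and sub-band variants discussed above gain their additional 3-7x by reusing partial sums across neighbouring DM trials.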
The size of a quasar's mid-IR emission region inferred from microlensed images of Q2237+0305
We use published mid-IR and V-band flux ratios for images A and B of
Q2237+0305 to demonstrate that the size of the mid-IR emission region has a
scale comparable to or larger than the microlens Einstein radius (ER, ~10^17 cm
for solar-mass stars). Q2237+0305 has been monitored extensively in the R and
V-bands for ~15 years. The variability record shows significant microlensing
variability of the optical emission region, and has been used by several
studies to demonstrate that the optical emission region is much smaller than
the ER for solar-mass objects. For the majority of the monitoring history, the
optical flux ratios have differed significantly from those predicted by
macro-models. In contrast, recent observations in mid-IR show flux ratios
similar to those measured in the radio, and to predictions of some lens models,
implying that the mid-IR flux is emitted from a region that is at least 2
orders of magnitude larger than the optical emission region. We have calculated
the likelihood of the observed mid-IR flux ratio as a function of mid-IR
source size given the observed V-band flux ratio. The expected flux ratio for a
source having dimensions of ~1 ER is a sensitive function of the macro model
adopted. However, we find that the probability of source size given the observed
flux ratios is primarily sensitive to the ratio of the macro-model
magnifications. The majority of published macro models for Q2237+0305 yield a
flux ratio for images B and A of 0.8 - 1.1. By combining probabilities from the
ratios A/B and C/D we infer that the diameter of a circular IR emission region
is >1 ER with >95% confidence. For microlensing by low-mass stars, this source
size limit rules out non-thermal processes, such as synchrotron radiation, as
the mechanism for the mid-IR emission.
Comment: 13 pages, 8 figures, to be published in MNRAS
Survey-scale discovery-based research processes: Evaluating a bespoke visualisation environment for astronomical survey data
Next generation astronomical surveys naturally pose challenges for
human-centred visualisation and analysis workflows that currently rely on the
use of standard desktop display environments. While a significant fraction of
the data preparation and analysis will be taken care of by automated pipelines,
crucial steps of knowledge discovery can still only be achieved through various
levels of human interpretation. As the number of sources in a survey grows,
there is a need to both modify and simplify repetitive visualisation processes
that must be completed for each source. As tasks such as per-source quality
control, candidate rejection, and morphological classification all share a
single instruction, multiple data (SIMD) work pattern, they are amenable to a
parallel solution. Selecting extragalactic neutral hydrogen (HI) surveys as a
representative example, we use system performance benchmarking and the visual
data and reasoning (VDAR) methodology from the field of information
visualisation to evaluate a bespoke comparative visualisation environment: the
encube visual analytics framework deployed on the 83 Megapixel Swinburne
Discovery Wall. Through benchmarking using spectral cube data from existing HI
surveys, we are able to perform interactive comparative visualisation via
texture-based volume rendering of 180 three-dimensional (3D) data cubes at a
time. The time to load a configuration of spectral cubes scale linearly with
the number of voxels, with independent samples of 180 cubes (8.4 Gigavoxels or
34 Gigabytes) each loading in under 5 minutes. We show that parallel
comparative inspection is a productive and time-saving technique which can
reduce the time taken to complete SIMD-style visual tasks currently performed
at the desktop by at least two orders of magnitude, potentially rendering some
labour-intensive desktop-based workflows obsolete.
Comment: 21 pages, 10 figures, accepted for publication in the Publications of the Astronomical Society of Australia
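As a consistency check on these figures (our arithmetic, assuming 32-bit voxels, which the quoted numbers imply):

```latex
8.4\times10^{9}\ \text{voxels} \times 4\ \text{bytes/voxel} \approx 33.6\ \text{GB} \approx 34\ \text{GB},
\qquad
\frac{34\ \text{GB}}{300\ \text{s}} \approx 0.11\ \text{GB/s},
```

so "under 5 minutes" corresponds to a sustained aggregate load rate of at least roughly 110 MB/s, consistent with the reported linear scaling in voxel count.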
Teraflop per second gravitational lensing ray-shooting using graphics processing units
Gravitational lensing calculation using a direct inverse ray-shooting
approach is a computationally expensive way to determine magnification maps,
caustic patterns, and light-curves (e.g. as a function of source profile and
size). However, as an easily parallelisable calculation, gravitational
ray-shooting can be accelerated using programmable graphics processing units
(GPUs). We present our implementation of inverse ray-shooting for the NVIDIA
G80 generation of graphics processors using the NVIDIA Compute Unified Device
Architecture (CUDA) software development kit. We also extend our code to
multiple-GPU systems, including a 4-GPU NVIDIA S1070 Tesla unit. We achieve
sustained processing performance of 182 Gflop/s on a single GPU, and 1.28
Tflop/s using the Tesla unit. We demonstrate that billion-lens microlensing
simulations can be run on a single computer with a Tesla unit in timescales of
order a day without the use of a hierarchical tree code.
Comment: 21 pages, 4 figures, submitted to New Astronomy
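In outline, inverse ray-shooting maps a dense grid of image-plane rays through the lens equation and accumulates them in a source-plane map whose counts are proportional to the magnification. A minimal CUDA sketch of the per-ray work (names, layout, and the softening constant are our illustrative assumptions; the paper's implementation is more heavily optimised, e.g. via shared-memory tiling of the lens list):

```cuda
// One thread per image-plane ray. In Einstein-radius units the lens
// equation for N point masses is y = x - sum_j m_j (x - x_j)/|x - x_j|^2.
// Rays are binned into a source-plane count map with atomic increments;
// the magnification map is proportional to these counts.
__global__ void shoot_rays(const float2 *lens_pos, const float *lens_mass,
                           int nlens, float2 x0, float dx, int nx, int ny,
                           float2 src0, float src_scale, int src_res,
                           unsigned int *count)
{
    int ix = blockIdx.x * blockDim.x + threadIdx.x;
    int iy = blockIdx.y * blockDim.y + threadIdx.y;
    if (ix >= nx || iy >= ny) return;

    // Image-plane position of this ray.
    float2 x = make_float2(x0.x + ix * dx, x0.y + iy * dx);

    // Apply the deflection from every lens.
    float2 y = x;
    for (int j = 0; j < nlens; ++j) {
        float dxj = x.x - lens_pos[j].x;
        float dyj = x.y - lens_pos[j].y;
        float r2  = dxj * dxj + dyj * dyj + 1e-12f;  // softened to avoid /0
        float w   = lens_mass[j] / r2;
        y.x -= w * dxj;
        y.y -= w * dyj;
    }

    // Deposit the deflected ray into the source-plane map.
    int sx = (int)((y.x - src0.x) / src_scale);
    int sy = (int)((y.y - src0.y) / src_scale);
    if (sx >= 0 && sx < src_res && sy >= 0 && sy < src_res)
        atomicAdd(&count[sy * src_res + sx], 1u);
}
```

Every ray is independent, so the image plane can simply be partitioned between devices, which is what makes the multi-GPU extension described above straightforward.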
The Ray Bundle method for calculating weak magnification by gravitational lenses
We present here an alternative method for calculating magnifications in
gravitational lensing -- the Ray Bundle method. We provide a detailed
comparison of the magnification distributions obtained against analytic
results and conventional ray-shooting methods. The Ray
Bundle method provides high accuracy in the weak lensing limit, and is
computationally much faster than (non-hierarchical) ray-shooting methods at
comparable accuracy.
The Ray Bundle method is a powerful and efficient technique with which to
study gravitational lensing within realistic cosmological models, particularly
in the weak lensing limit.
Comment: 9 pages LaTeX, 8 figures, submitted to MNRAS
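In outline (our gloss, not the paper's notation), the method traces a compact bundle of rays around each central ray and obtains the magnification from flux conservation as an area ratio,

```latex
\mu = \frac{A_0}{A},
```

where A is the bundle's cross-sectional area at the source plane after deflection and A_0 the area the same bundle would subtend without lensing. Because each bundle samples the lens mapping locally, no dense grid of rays is required, which is where the speed advantage over non-hierarchical ray-shooting comes from.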
Analysing Astronomy Algorithms for GPUs and Beyond
Astronomy depends on ever-increasing computing power. Processor clock-rates
have plateaued, and increased performance is now appearing in the form of
additional processor cores on a single chip. This poses significant challenges
to the astronomy software community. Graphics Processing Units (GPUs), now
capable of general-purpose computation, exemplify both the difficult
learning-curve and the significant speedups exhibited by massively-parallel
hardware architectures. We present a generalised approach to tackling this
paradigm shift, based on the analysis of algorithms. We describe a small
collection of foundation algorithms relevant to astronomy and explain how they
may be used to ease the transition to massively-parallel computing
architectures. We demonstrate the effectiveness of our approach by applying it
to four well-known astronomy problems: Högbom CLEAN, inverse ray-shooting for
gravitational lensing, pulsar dedispersion and volume rendering. Algorithms
with well-defined memory access patterns and high arithmetic intensity stand to
receive the greatest performance boost from massively-parallel architectures,
while those that involve a significant amount of decision-making may struggle
to take advantage of the available processing power.
Comment: 10 pages, 3 figures, accepted for publication in MNRAS
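That closing observation can be phrased in roofline-model terms (a standard framing we add here, not the paper's notation): with arithmetic intensity I (useful operations per byte of memory traffic) and memory bandwidth B, attainable performance is roughly

```latex
P \approx \min\left(P_{\text{peak}},\; I \cdot B\right),
```

so regular-access, high-intensity algorithms such as ray-shooting approach the compute peak, while low-intensity or branch-heavy algorithms remain bandwidth- or divergence-limited.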